home *** CD-ROM | disk | FTP | other *** search
- Newsgroups: comp.lang.c,comp.lang.perl
- Path: netcom.com!crewman
- From: crewman@netcom.com (Kevin L. Weller)
- Subject: Re: Random File Access - I don't get it
- Message-ID: <crewmanDpEEAy.LI6@netcom.com>
- Sender: crewman@netcom2.netcom.com
- Organization: Me, Myself, and I
- X-Newsreader: Forte Free Agent 1.0.82
- References: <3160DE1E.495C@teleport.com>
- Date: Fri, 5 Apr 1996 16:41:46 GMT
-
- Scott Kinard <kinards@teleport.com> wrote:
- > Suppose I have two files, one contains text (1 line of random length text per
- >'record') and another which is my index into the text file, which contains the starting
- >and ending byte positions in the text file for each 'record'. Now the question..
- >Suppose the text file has 2,000,000 entries. Now, what's the difference in reading
- >1,000,000 lines from the index file to find the starting and ending byte positions in
- >the text file for record 1,000,000, and then seeking these in the text file, rather than
- >just reading 1,000,000 times from the text file to get the same 'record' in the first
- >place?
-
- There may not be much advantage if the index is not sorted, unless the
- index records are very small. Chances are, though, if there is an
- index, it IS sorted. Depending on HOW it's sorted, you can either:
-
- 1) Use a divide-and-conquer approach (a binary search) on the index to
- find the index record (average time complexity base 2 log n), then
- perform one read from the text file (so on average, your
- 2,000,000-record example would take about base 2 log 2,000,000 + 1, or
- approximately 22 reads, a h*ll of a lot better than 1,000,000!).
- Somebody please correct my math if it's off.... :-)
-
- 2) If the index is organized by a hashing function, the whole process
- should take only two reads (assuming no key collisions): one to hash
- into the index, and the other to read the text file.
-
- Either case is tremendously better than the average n/2 reads
- (2,000,000 / 2 = 1,000,000 in this case) a sequential search of the
- text file would require.
-
- Be cool!
-
- ------------------------------------------------------------------------------
- Kevin L. Weller, a.k.a. Crewman!!! /-------+--------------------\
- crewman@netcom.com | aTm | GIG 'EM, AGGIES! |
- This is my personal (non-business) account \-------+--------------------/
- ------------------------------------------------------------------------------
- %SYS-E-BADOPSYS, Fatal system error, DEC VMS halting \ All wiyht. Rho sritched
- -SYS-I-GETUNIX, Replace with UNIX immediately! \ mg kegtops awound?
- ------------------------------------------------------------------------------
-
-